Replace CAD *Ranks table calculation with SQL window function #10491

Draft: wants to merge 2 commits into main

gregorbg
Member

Kinda-followup to #10412

While testing the queries over in that other PR, I noticed that the RanksSingle and RanksAverage computations were kinda slow. After staring at the code for a while, I realized that it's really just computing numeric rankings at each of three levels: world, continent, and country.

Window functions have been available since MySQL 8.0, and MySQL 8 is pretty much standard everywhere now, so I believe it's fair to use them. See https://dev.mysql.com/doc/refman/8.4/en/window-function-descriptions.html#function_rank and https://dev.mysql.com/doc/refman/8.4/en/window-functions-usage.html for details.
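
For readers unfamiliar with `RANK()`, here is a minimal sketch of the idea, not the code in this PR. It assumes SQLite (via Python's standard `sqlite3` module, SQLite 3.25 or newer) as a stand-in for MySQL, and a toy `results` table with invented people and times that already holds one best per person. The real tables have many rows per person, so a `MIN(best)` step would be needed first.

```python
import sqlite3

# Toy stand-in for the results table; every row here is invented for illustration.
toy_rows = [
    # personId, eventId, continentId, countryId, best
    ("2001AAAA01", "333", "_Europe", "Germany", 700),
    ("2002BBBB01", "333", "_Europe", "France", 650),
    ("2003CCCC01", "333", "_Asia", "Japan", 650),
    ("2004DDDD01", "333", "_Europe", "Germany", 900),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE results "
    "(personId TEXT, eventId TEXT, continentId TEXT, countryId TEXT, best INTEGER)"
)
conn.executemany("INSERT INTO results VALUES (?, ?, ?, ?, ?)", toy_rows)

# One RANK() per level: the only difference is the PARTITION BY clause.
query = """
SELECT personId,
       best,
       RANK() OVER (PARTITION BY eventId ORDER BY best) AS worldRank,
       RANK() OVER (PARTITION BY eventId, continentId ORDER BY best) AS continentRank,
       RANK() OVER (PARTITION BY eventId, countryId ORDER BY best) AS countryRank
FROM results
ORDER BY worldRank, personId
"""
ranks = conn.execute(query).fetchall()
for row in ranks:
    print(row)
```

The two people tied at 650 both get world rank 1 and the next person gets rank 3, because `RANK()` leaves gaps after ties. Whether that tie handling matches the existing Ruby computation is exactly the kind of thing that still needs checking.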

We still need to discuss with WRT whether these queries are actually equivalent, so I'm setting this to Draft for the time being.

gregorbg force-pushed the feature/cad-rankings-window-query branch from 0836ee2 to c3431aa on December 30, 2024 13:32
gregorbg force-pushed the feature/cad-rankings-window-query branch from f98cc3d to 78e9f54 on December 31, 2024 03:20
Baguettely commented Dec 31, 2024

This should correctly capture the edge cases for nationality changes. I haven't been able to run a full regression test because a) it times out on phpMyAdmin and b) I can't compare the run time of the SQL against the Ruby. All the testing I have been able to do indicates no differences between this and the existing code.

# Get all personal records, grouping for each person by event, event+continent, and event+continent+country
WITH personal_bests AS (
   SELECT personId, 
      eventId,
      continentId,
      countryId, 
      MIN(best) AS value
   FROM ConciseSingleResults
   GROUP BY personId,
      eventId,
      continentId,
      countryId
   WITH ROLLUP
   HAVING eventId IS NOT NULL
),
# Calculate all rankings via RANK().
#  - worldRank ranks the best result for each person by event.
#  - continentRank ranks the best result for each person by event+continent.
#  - countryRank ranks the best result for each person by event+continent+country.
# If someone changes country and/or continent, their results from the previous region still count
# towards the rankings of others in that region, but do not count towards their own regional ranking.
# Regional rankings are NULL if they do not match the person's present region of representation.
ranking_calculations AS (
   SELECT personId,
      eventId,
      value,
      CASE WHEN pb.continentId IS NULL
         THEN RANK() OVER(PARTITION BY eventId
            ORDER BY CASE WHEN pb.continentId IS NULL THEN 0 ELSE 1 END, value)
         END AS worldRank,
      CASE WHEN pb.countryId IS NULL AND co.continentId = pb.continentId
         THEN RANK() OVER(PARTITION BY eventId, pb.continentId
            ORDER BY CASE WHEN pb.countryId IS NULL AND pb.continentId IS NOT NULL THEN 0 ELSE 1 END, value) 
         END AS continentRank,
      CASE WHEN pb.countryId IS NOT NULL AND p.countryId = pb.countryId
          THEN RANK() OVER(PARTITION BY eventId, pb.countryId
             ORDER BY CASE WHEN pb.countryId IS NOT NULL THEN 0 ELSE 1 END, value)
          END AS countryRank
   FROM personal_bests pb
   JOIN Persons p
      ON p.wca_id = pb.personId
         AND p.subId = 1
   JOIN Countries co
     ON co.id = p.countryId
)

# Group again by person to combine the world/continent/country rankings for each person.
# If someone has changed region of representation, their regional ranking is based on the fastest
# result achieved under their present region. If they have not achieved any results in an event
# since changing their region, their regional rank is stored as 0.
SELECT personId,
   eventId,
   MIN(value) AS best,
   MIN(worldRank) AS worldRank,
   IFNULL(MIN(continentRank), 0) AS continentRank,
   IFNULL(MIN(countryRank), 0) AS countryRank
FROM ranking_calculations
GROUP BY personId,
   eventId
ORDER BY eventId,
   worldRank
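
To make the nationality-change behaviour easier to check by hand, here is a small sketch that runs the same query shape against toy data. It is not a regression test. It assumes SQLite through Python's `sqlite3` module instead of MySQL, so `WITH ROLLUP` is replaced by an explicit `UNION ALL` over the three grouping levels and `#` comments become `--`. The four people, their WCA IDs, and their times are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ConciseSingleResults
  (personId TEXT, eventId TEXT, continentId TEXT, countryId TEXT, best INTEGER);
CREATE TABLE Persons (wca_id TEXT, subId INTEGER, countryId TEXT);
CREATE TABLE Countries (id TEXT, continentId TEXT);
INSERT INTO Countries VALUES ('France', '_Europe'), ('Japan', '_Asia');
-- Hypothetical people: MOVE01 and GONE01 switched from France to Japan.
INSERT INTO Persons VALUES
  ('2010MOVE01', 1, 'Japan'), ('2011STAY01', 1, 'France'),
  ('2012STAY02', 1, 'Japan'), ('2013GONE01', 1, 'Japan');
INSERT INTO ConciseSingleResults VALUES
  ('2010MOVE01', '333', '_Europe', 'France', 500),  -- before the change
  ('2010MOVE01', '333', '_Asia',   'Japan',  800),  -- after the change
  ('2011STAY01', '333', '_Europe', 'France', 600),
  ('2012STAY02', '333', '_Asia',   'Japan',  700),
  ('2013GONE01', '333', '_Europe', 'France', 400);  -- no results since moving
""")

QUERY = """
WITH personal_bests AS (
   -- UNION ALL stands in for GROUP BY ... WITH ROLLUP (an assumption of this sketch).
   SELECT personId, eventId, continentId, countryId, MIN(best) AS value
   FROM ConciseSingleResults GROUP BY personId, eventId, continentId, countryId
   UNION ALL
   SELECT personId, eventId, continentId, NULL, MIN(best)
   FROM ConciseSingleResults GROUP BY personId, eventId, continentId
   UNION ALL
   SELECT personId, eventId, NULL, NULL, MIN(best)
   FROM ConciseSingleResults GROUP BY personId, eventId
),
ranking_calculations AS (
   SELECT personId, eventId, value,
      CASE WHEN pb.continentId IS NULL
         THEN RANK() OVER (PARTITION BY eventId
            ORDER BY CASE WHEN pb.continentId IS NULL THEN 0 ELSE 1 END, value)
         END AS worldRank,
      CASE WHEN pb.countryId IS NULL AND co.continentId = pb.continentId
         THEN RANK() OVER (PARTITION BY eventId, pb.continentId
            ORDER BY CASE WHEN pb.countryId IS NULL AND pb.continentId IS NOT NULL
                          THEN 0 ELSE 1 END, value)
         END AS continentRank,
      CASE WHEN pb.countryId IS NOT NULL AND p.countryId = pb.countryId
         THEN RANK() OVER (PARTITION BY eventId, pb.countryId
            ORDER BY CASE WHEN pb.countryId IS NOT NULL THEN 0 ELSE 1 END, value)
         END AS countryRank
   FROM personal_bests pb
   JOIN Persons p ON p.wca_id = pb.personId AND p.subId = 1
   JOIN Countries co ON co.id = p.countryId
)
SELECT personId, eventId,
   MIN(value) AS best,
   MIN(worldRank) AS worldRank,
   IFNULL(MIN(continentRank), 0) AS continentRank,
   IFNULL(MIN(countryRank), 0) AS countryRank
FROM ranking_calculations
GROUP BY personId, eventId
ORDER BY eventId, worldRank
"""
rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

On this toy data, `2013GONE01` gets world rank 1 with continent and country ranks of 0, since they have no results under Japan. `2011STAY01` is ranked 3rd in both France and Europe, because the French results of the two people who moved still count against them. `2010MOVE01` keeps an overall `best` of 500 while their regional ranks of 2 come from the 800 set under Japan, which may be worth confirming against the intended semantics.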
